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ABSTRACT 



Context. Whenever correlation functions are used for inference about cosmological parameters in the context of a Bayesian analysis, 
the likelihood function of correlation functions needs to be known. Usually, it is approximated as a multivariate Gaussian, though this 
is not necessarily a good approximation. 

Aims. We show how to calculate a better approximation for the probability distribution of correlation functions, which we call "quasi- 
Gaussian". 

Methods. Using the exact univariate PDF as well as constraints on correlation functions previously derived, we transform the cor- 
relation functions to an unconstrained variable for which the Gaussian approximation is well justified. From this Gaussian in the 
transformed space, we obtain the quasi-Gaussian PDF. The two approximations for the probability distributions are compared to the 
"true" distribution as obtained from simulations. Additionally, we test how the new approximation performs when used as likelihood 
in a toy-model Bayesian analysis. 

Results. The quasi-Gaussian PDF agrees very well with the PDF obtained from simulations; in particular, it provides a significantly 
better description than a straightforward copula approach. In a simple toy-model likelihood analysis, it yields noticeably different 
results than the Gaussian likelihood, indicating its possible impact on cosmological parameter estimation. 
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1. Introduction 

Correlation functions are an important tool in cosmology, since 
they constitute one standard way of using observations to con- 
sttain cosmological parameters. The two-point correlation func- 
tion %(x) obtained from observational data is usually analyzed in 
the framework of Bayesian statistics: Using Bayes' Theorem 



P(0\O = 



UQ)-p(Q) 
p(0 ' 



(i) 



the measured correlation function £ is used to quantify how the 
optimal parameters 9 of the cosmological model change. It is in- 
herent to Bayesian methods that we need to "plug in" the prob- 
ability distribution function (PDF) of the data involved. In the 
case of correlation functions, this likelihood £.(6) = p(£\G) is 
usually assumed to be a Gaussian. For example in Seljak & 
Bertschinger (1993), the authors perform an analysis of the an- 
gular correlation function of the CMB (as calculated from COBE 
data), stating that a Gaussian is a good approximation for its like- 
lihood, at least near the peak. Also in common methods of BAO 
detection, a Gaussian distribution of correlat ion functions is usu- 
ally assumed (see e.g. Laba tie et al.ll2Q12ab . While this may be 
a good approximation in some cases, there are strong hints that 
there may in other cases b e strong deviations f rom a Gaussian 
likelihood. For example in Hartl ap et al.l (l2009h . using indepen- 
dent component analysis, a significant non-Gaussianity in the 
likelihood is detected in the case of a cosmic shear study. 

A more fundam e ntal a pproach to this is performed in 
ISchneider & H artlap (2009): Under very general conditions, it 
is shown that there exist constraints on the correlation function, 



meaning that a valid correlation function is confined to a certain 
range of values. This proves that the likelihood of the correlation 
function cannot be truly Gaussian: A Gaussian distribution has 
infinite support, meaning that it is non-zero also in "forbidden" 
regions. 

Thus, it is necessary to look for a better description of the 
likelihood than a Gaussian. This could lead to a more accu- 
rate data analysis and thus, in the end, result in more precise 
consttaints on the cosmological parameters obtained by such an 
analysis. It is also worth noting that Carron (2012) has shown 
some inconsistencies of Gaussian likelihoods: For the case of 
power spectrum estimators, using Gaussian likelihoods can as- 
sign too much information to the data and thus violate the 
Cramer-Rao inequality. 

Of course, non-Gaussian likelihood functions have been 
studie d before: For the case of the cosmic shear power spec- 
trum, ISato et a l. (2011) use a Gaussian copula to construct a 
more accurate likelihood function. The use of a copula requires 
knowledge of the univariate distribution, for which the authors 
find a good approximation from numerical simulations. 

However, in the case of correlation functions, similar work 
is sparse in the literature, so up to now, no better general ap- 
proximation than the Gaussian likelihood has been obtained. 
Obviously, the most fundamental approach to this is to try and 
calculate the probabilit y distributions of correlation functions 
analytically: iKeitel & Schneider (201 1) used a Fourier mode ex- 
pansion of a Gaussian random field in order to obtain the charac- 
teristic function of p(£). From it, they calculate the uni- and bi- 
variate probability distribution of £ through a Fourier transform. 
Since this calculation turns out to be very tedious, an analyti- 
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cal computation of higher- variate PDFs is probably not feasible. 
Still, coupling the exact univariate distributions with a Gaussian 
copula might yield a multivariate PDF that is a far better approx- 
imation than the Gaussian one - we will show, however, that this 
is not the case. 

In the main part of this work, we will therefore try a different 
approach: Using a transformation of variables that involves the 
aforementioned constraints on correlation functions derived in 
Schneider & Hartlap (2009), we will derive a way of computing 
a new, "quasi-Gaussian" likelihood function for the correlation 
functions of one-dimensional Gaussian random fields. 

It is notable that the strategy of transforming a random vari- 
able in order to obtain a well-known probability distribution 
(usually a Gaussian one) suggests a comparison to similar at- 
tempts using, for ex ample, Box-Cox transformations (see e.g. 
iJoachimi et al.ll201lb . We will argue that a Box-Cox approach 
does not seem to yield satisfactory results in our case. 

This work is structured as follows: In lSect. 21 we briefly sum- 
marize previous work and explain how to compute the quasi- 
Gaussian likelihood for correlation functions, which is then 
tested thoroughly and compared to numerical simulations in 
ISect. 31 In lSect. 41 we investigate how the new approximation of 
the likelihood performs in a Bayesian analysis of simulated data, 
compared to the usual Gaussian approximation. Our attempts to 
find a new approximation of the likelihood using a copula or 
Box-Cox approach are presented in ISect. 51 andl6l respectively. 
We conclude with a short summary and outlook in ISect. 71 

2. A new approximation for the likelihood of 
correlation functions 

2.1. Notation 

We denote a random field by g(x) and define its Fourier trans- 
form as 



m 



■/* 



x g(x) e 



-ik ■.> 



(2) 



In order to perform numerical simulations involving random 
fields, it is necessary to discretize the field, i.e. evaluate it at 
discrete grid point, and introduce boundary conditions: Limiting 
the field to an n-dimensional periodic cube with side length L in 
real space is equivalent to choosing a grid in £-space such that 
the wave number can only take integer multiples of the smallest 
possible wave number Ak = 2n/L. In order to simulate a field 
that spans the whole space R", we can choose L such that no 
structures at larger scales than L exist. In this case, the hyper- 
cube is representative for the whole field and we can extend it to 
R" by imposing periodic boundary conditions, i.e. by setting (in 
one dimension) g(x + L) - g(x). After discretization, the integral 
from |Eq. (2)| simply becomes a discrete Fourier series, and the 
field can be written as 



8(x) 



2> e " 

A- 



where we defined the Fourier coefficients 



8k 
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(3) 
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The two-point correlation function of the field is defined as 
i;(x,y) - (g(x) ■ g*(y)) ; it is the Fourier transform of the power 
spectrum: 



«W) 



r d n k 
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P(\k\) exp(ifc • x). 



In the one-dimensional case, Eq. (5) becomes a simple cosine 
transform, 
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P(\k\) cos(fct)- 



(6) 



A random field is called Gaussian if its Fourier components gk 
are statistically independent and the probability distribution of 
each gk is Gaussian, i.e. 



1 
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(7) 



A central property of a Gaussian random field is that it is entirely 
specified by its power spectrum, which is given as a function of 



P( W) =(!)VH!)v (W ). 



(8) 



In the following, we will only be concerned with one- 
dimensional, periodic Gaussian fields of field length L, evaluated 
at N discrete grid points g, = g(x,), x, = i L/N. 

2.2. Constraints on correlation functions 



As shown in ISchneider & Hartlapl (120091) (hereafter SH2009), 
correlation functions cannot take arbitrary values, but are subject 
to constraints, originating from the non-negativity of the power 
spectrum P(k). The constraints are expressed in terms of the cor- 
relation coefficients 



f (n Ax) 



(9) 



Since we will be using a gridded approach, we shall denote 
%(n Ax) = £„, where Ax = L/N denotes the separation between 
adjacent grid points. The constraints can be written in the form 



r n i(r u r 2 , . . . , r„_i) < r n < r nu (r u r 2 , . . . , r„_i), 



(10) 



where the upper and lower boundaries are functions of the r, with 
i < n. SH2009 use two approaches to calculate the constraints: 
The first one applies the Cauchy-Schwartz inequality and yields 
the following constraints: 



-l < n <l, 

-1 + 2r\ < r 2 <1, 

, (n +r„-i) 1 (n -r„_i) 2 

1 H : < r„ < 1 



1 +r„_ 



1 - r n -2 



n>2. 



(11) 

(12) 

(13) 



The second approach involves the covariance matrix C, which 
for a one-dimensional random field g, of Af grid points, reads 



Cu = 



'<#}> = &-;'!• 



(14) 



Calculating the eigenvalues of C and making use of the fact 
that they have to be non-negative due to the positive semi- 
definiteness of C allows the calculation of constraints on r, for 
arbitrary n. As it turns out, for n > 4, the constraints obtained 
by this method are different (that is, stricter) than the ones from 
the Cauchy-Schwarz inequality - e.g. for n = 4, while the up- 
per bound remains unchanged compared to |Eq. (13)| the lower 
bound then reads 



(5) >*41 = -1 + 



r\(l - 4r 2 ) + 2r 2 2 (l + r 2 ) + 2 n r 3 (l - 2r 2 ) + r\ 

\-2r\ + n 



(15) 



P. Wilking and P. Schneider: A quasi-Gaussian approximation for the probability distribution of correlation functions 



Moreover, SH2009 show that the constraints are "optimal" (in 
the sense that no stricter bounds can be found for a general power 
spectrum) for the one-dimensional case. In more than one di- 
mension, although the constraints derived by SH2009 still hold, 
stricter constraints will hold due to the multidimensional inte- 
gration in '. 



Eq. (5) and the isotropy of the field. 



One can show that the constraints are obeyed by generat- 
ing realizations of the correlation function of a Gaussian ran- 
dom field with a power spectrum that is randomly drawn for 
each realization (we will elaborate more on these simulations 
in lSect. 23i . In addition to the plots shown in SH2009, we show 
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Fig. 1. Constraints in the r3-r4-plane (i.e. with fixed r\ and r%) for 
a field with N = 16 grid points and randomly drawn power spec- 
tra. For the lower bound, the results from the Cauchy-Schwarz 
inequality (green) as well as from the covariance matrix ap- 
proach (blue) are shown. 



the r3-r4-plane for illustration, see Fig. 1 Note that the admissi- 
ble region is populated only sparsely, since the constraints on r->, 
depend explicitly on r\ and rz, so we have to fix those, resulting 
in only 845 of the simulated 400 000 realizations that have r\ 
and r2 close to the fixed values. [Fig. 1| shows the lower bounds 
r4i as calculated from the Cauchy-Schwarz inequality and from 
the covariance matrix method. One can see that the latter (plotted 
in blue) is slightly stricter than the former, shown in green. 

A central idea of SH2009 on the way to a better approxi- 
mation for the PDF of correlation functions is to transform the 
correlation function to a space where no constraints exist, and 
where the Gaussian approximation is thus better justified. This 
can be done by defining 



2f* — y — Y \ 

y n - atanh — — — = atanh (x„) . 

•no. ~ T n\ 



(16) 



The transformation to x n maps the allowed range of r n to the 
interval (-1, +1), which is then mapped onto the entire real axis 
(-oo, +oo) by the inverse hyperbolic tangent. 



2.3. Simulations 

In the following, we will explain the simulations we use to gener- 
ate realizations of the correlation function of a Gaussian random 
field with a given power spectrum. A usual method to do this is 
the following: One can draw realizations of the field in Fourier 
space, i.e. according to |Eq. (7)] where the width of the Gaussian 
is given by the theoretical power spectrum via |Eq. (8j| The field 
is then transformed to real space, and £ can then be calculated 
using an estimator based on the gj. 

We developed an alternative method: Since we are not in- 
terested in the field itself, we can save memory and CPU time 
by directly drawing realizations of the power spectrum. To do 
so, we obviously need the probability distribution of the power 
spectrum components P, = P(kj) = P(i Ak) - this can be 

since for one realization, P< = 



calculated easily from |Eq 

(27r/A£)|f,-| 2 = L|| ; | 2 



(7) 



The calculation yields 



1 



rt/^—cp — 



H(Pd, 



(17) 



where H(£) denotes the Heaviside step function. Thus, each 
power spectrum mode follows an exponential distribution with a 
width given by the model power Loj. 

For each realization of the power spectrum, the correlation 
function can then be computed using a discrete cosine transform 
given by a discretization of Eq. (6) 



2 N/2 

- 2^ P„ cos 



2mnn 
N ' 



(18) 



This method can in principle be generalized to higher- 
dimensional fields, where the savings in memory and CPU us- 
age will of course be far greater. It is clear that this new method 
should yield the same results as the established one (which in- 
volves generating the field). We confirmed this fact numerically; 
additionally, we give a proof for the equivalence of the two meth- 



ods in Appendix A 



In order to draw the power spectrum components P,- accord- 



ing from the PDF given in Eq. (17) we integrate and invert this 
equation, yielding 



Pi = -Loj In (u) , 



(19) 



where u i s a uniformly dis tributed random number with u e 
[0, 1] (see lPress et al.ll2007l for a more elaborate explanation on 
random deviates). 

As an additional improvement to our simulations, we im- 
plement a concept called quasi-random (or "sub-random") sam- 
pling; again, see Press et al. (2007) for more details: Instead 
of drawing uniform deviates u for each component P,, we use 
quasi-random points, thus ensuring that we sample the whole 
hypercube of power spectra as fully as possible, independent of 
the sample size. One way to generate quasi-randomness is by 
using a so-called SoboF sequence, see e.g. Joe & Kuo (2008) 
for explanations. Implementations for Sobol' sequences exist in 
the literature, we use a C++ program available at the authors' 
website http : //web . maths . unsw . edu . au/~f kuo/sobol/ 

To illustrate the non-Gaussian probability distribution of cor- 
relation functions, the left-hand part of Fig. 2 shows the iso- 
probability contours of p{^,^), i.e. the PDF of the correlation 
function for a lag equal to 3 and 6 times the distance between 
adjacent grid points, calculated over the 400 000 realizations of 
a field with N — 32 grid points and a Gaussian power spectrum 
cr 2 (k) oc exp(-(fc/£o) 2 ), whose width ko is related to the field 
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Fig. 2. Iso-probability contours of p(ft > ft) (left) and p(y3,y6) (right) for 400 000 realizations of a field with N = 32 grid points and 
a Gaussian power spectrum with Lko = 80. The dashed red contours show the best-fitting Gaussian approximations in both cases. 



length L by Lko - 80. The red contours show the correspond- 
ing Gaussian with the same mean and standard deviation. It is 
clearly seen that the Gaussian approximation is not particularly 
good. In order to check whether the Gaussian approximation is 
justified better in the new variables y,, we transform the sim- 
ulated realizations of the correlation function to y-space, using 
the constraints r, u and r,i from the covariance matrix approach, 
which we calculate exploiting the matrix methods explained in 
SH2009. The right-hand part of Fig. 2 shows the probability dis- 
tribution p(yi,y(,) and the Gaussian with mean and covariance 
as computed from {y 3, }>(,}. Obviously, the transformed variables 
follow a Gaussian distribution much more closely than the orig- 
inal correlation coefficients. 

2.4. Transformation of the PDF 

Before actually assessing how good the Gaussian approximation 
in y-space is, we will first try to make use of it and develop a 
method to obtain a better approximation of the probability dis- 
tributions in r- and, in the end, ^-space. 

Assuming a Gaussian distribution in y-space, we can calcu- 
late the PDF in r-space as 

Pr (r, , . . . , r n ) = p y (y u . . . , y„) ■ |det (J" y )\ . (20) 

Here, J r ^ y denotes the Jacobian matrix of the transformation 
defined in |Eq. (16)] so 

r ^y _ dyi_ _ dyi_dxi_ _ 1 dxj 
'■' drj dxi drj 1 - x 2 drj 

2 
(r iu - r,i) 2 - (2r,- - r, u - r,i) 2 



(r,u - r A )5ij - (n - r a )— - - (r ia - n)—±- 



dr 



drj\ 



(21) 



Since the lower and upper bounds r,i and r, u only depend on the 
rj with j < i, all partial derivatives with i < j vanish. Thus, 



J r ~* y is a lower triangular matrix, and its determinant can sim- 
ply be calculated as the product of its diagonal entries. Hence, 
the partial derivatives drn/drj and dr^ldrj of the bounds are not 
needed. 

As an example, we calculate the bivariate distribution 
p{r\,r2), assuming p(y\,y2) to be a Gaussian with mean and 
covariance matrix obtained from the 400 000 simulated realiza- 



tions. FIg3 shows p(r\ , r-i) from the simulations (solid contours) 
as well as the transformed Gaussian (dashed red contours) - ob- 
viously, it is a far better approximation of the true distribution 
than a Gaussian in r-space. The upper and lower bounds on r2 
are also shown. 

In the next step, we will show how to compute the quasi- 
Gaussian likelihood for the correlation function £, since £ (and 
not r) is the quantity that is actually measured in reality. The de- 
terminant of the Jacobian for the transformation £ — > r is simple, 
since r, = £,7£o; however, we have to take into account that in 
£-space, an additional variable, namely £0, exists. Thus, we have 
to write down the transformation as follows: 

n n 

k&, ft , .... &) n d& = /(ft , . . . , ft 1 ft) Kft) n d £ 

(=1 

n 

= p r (n , ■ ■ ■ , r„ I ft) Kft) \\ Ar i 

i=\ 
n 

= p y (y\ , ■ • • ,y n I ft) p(ft) Y\ d -y ; - ( 2T> 



/=() 



!=1 



We see that the n-variate PDF in ^-space explicitly depends on 
the distribution of the correlation function at zero lag, /;>(ft). 
In the following, we will use the analytical formula derived in 
iKeitel & S chneiden d2011h . In the univariate case, it reads 

p(£) = Yj WmiCn) - H(-£)H(-C n )} 
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(23) 



where H{^) again denotes the Heaviside step function and the 
C n are given by C„ = cr^ cos{k n x). Here, x is the lag parame- 
ter, and thus, for the zero-lag correlation function £o> C„ = cr^ 
holds. When evaluating |Eq. (23)} we obviously have to truncate 
the summation at some N'. Since this number of modes N' cor- 
responds to the number of grid points in Fourier space (which 
is half the number of real-space grid points N, but as explained 
in 

N/2-l. As[FtgT4 



Appendix A we set the highest mode to zero), we truncate 

at N' 



shows, the PDF of the {£o}-sample 
(black dots) agrees perfectly with the result from the analytical 
formula (red curve). 

As an illustrative example, we will show how to compute the 
quasi-Gaussian approximation of p(%\, £2)- In order to do so, we 
need to calculate the conditional probability p{^\ , £2^0) and then 
integrate over £o- To perform the integration, we divide the £0- 
range obtained from the simulations into bins and transform the 
integral into a discrete sum. 

Furthermore, we have to incorporate a potential in- 
dependence of the covariance matrix and mean of y. We examine 
this dependence in Fig. 6 - clearly, the covariance matrix shows 



hardly any ^-dependence, so as a first try, we treat it as indepen- 
dent of ^0 and simply use the covariance matrix obtained from 
the full sample. In contrast, the mean does show a non-negligible 
£0 -dependence, as the top panels of|Fig. 6 illustrate. However, 



this dependence seems to be almost linear, with a slope that de- 
creases for higher lag parameter. Thus, since the calculation of 
the multivariate quasi-Gaussian PDF involves computing a con- 
ditional probability with a fixed ^o-value as an intermediate step, 
we can easily handle the ^-dependence by calculating the mean 
only over realizations close to the current value of ^0 - in the ex- 
emplary calculation of p{^\,^2) discussed here, we average only 
over those realizations with a Rvalue in the current bin of the 
^-integration. 
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Fig. 4. p(^o) for a random field with N - 32 grid points and a 
Gaussian power spectrum with Lko - 80: The black dots shows 
the PDF from the simulation, whereas the red curve was calcu- 
lated from the analytical formula, using N/2 -1 = 15 modes and 
the same field length and power spectrum. 



Our final result for p(^u^i) is shown in Fig. 5 Clearly, the 



quasi-Gaussian approximation (red dashed contours) follows the 
PDF obtained from the simulations quite closely. Note that the 
quasi-Gaussian PDF was calculated using only 10 ^o-bins for the 
integration, which seems sufficient to obtain a convergent result. 
It should be noted that, although our phenomenological treat- 
ment of the ^o-dependence of mean and covariance matrix seems 
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Fig. 3. p(r\,r2) for a field with N - 32 grid points and a 
Gaussian power spectrum with Lko = 80. The dashed red con- 
tours show the transformed Gaussian approximation from y- 
space. 
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Fig. 5. p(^\,^2) for a field with N - 32 grid points and a 
Gaussian power spectrum with Lko = 80. The dashed red con- 
tours show the transformed Gaussian approximation from y- 
space. 
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Fig. 6. The mean (top row) and standard deviation (second row) of y n for different n as well as different off-diagonal elements of 
the covariance matrix C ((y„)) (third row) as functions of go, determined from simulations; the error bars show the corresponding 
standard errors (which depend on the number of realizations with a value of £q that lies in the current bin). 



to be sufficiently accurate to give satisfying results, the analyti- 
cal calculation of this dependence would of course be preferable, 
since it could improve the accuracy and the mathematical consis- 
tency of our method. We show this calculation in | Appendix B| 
However, since the results turn out to be computationally im- 
practical and also require approximations, thus preventing a gain 
in accuracy, we refrain from using the analytical mean and co- 
variance matrix. 



3. Quality of the quasi-Gaussian approximation 

For the bivariate distributions we presented so far, it is appar- 
ent at first glance that the quasi-Gaussian likelihood provides a 
more accurate approximation than the Gaussian one. However, 
we have not yet quantified how much better it actually is. 



There are, in principle, two ways to address this: For one, we 
can directly test the Gaussian approximation in y-space. We will 
do so in ISect. 3.ll making use of the fact that the moments of a 
Gaussian distribution are well-known. Alternatively, we can per- 
form tests in f-space, which we will do in lSect. 3T21 Although in 
this case, we have to resort to more cumbersome methods, this 
approach is actually better, since it allows for a direct compari- 
son to the Gaussian approximation in £-space. Furthermore, it is 
more significant, because it incorporates testing the transforma- 
tion, especially our treatment of the ^-dependence of the mean 
and covariance in y-space. 
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Fig. 7. Skewness (left-hand panel) and kurtosis (right-hand panel) of {£„} (black circles) and {y„} (red triangles) as functions of lag 
n, for a Gaussian power spectrum with Lko - 80. The blue squares show the skewness / kurtosis of Gaussian samples of the same 
length, mean and variance as the corresponding samples {y„}. 



3.1. Tests in y -space 

In order to test the quality of the Gaussian approximation in 
y-space, we calculate moments of the univariate distributions 
p(y„), as obtained from our numerical simulations. The quan- 
tities we are interested in are the (renormalized) third-order mo- 
ment, i.e. the skewness 



7 = 



(x - nY 



W?3 

3/2 



(24) 



as well as kurtosis k, which is essentially the fourth-order mo- 
ment: 



(x-pf 



3 = -4-3. 

m 2 



(25) 



Here, m,- = {(x - p)') denote the central moments. Defined like 
this, both quantities are zero for a Gaussian distribution. 

The results can be seen in |Fig. 7| The red triangles show the 
skewness / kurtosis of the 400 000 simulated {y„}-samples (again 
for a field with N - 32 grid points and a Gaussian power spec- 
trum with Lko - 80). Clearly, both moments deviate from zero, 
which cannot be explained solely by statistical scatter: The blue 
squares show the skewness / kurtosis of Gaussian samples of 
the same length, mean and variance as the corresponding sam- 
ples {y n }, and they are quite close to zero. However, it can also 
be clearly seen that the skewness and kurtosis of the univariate 
distributions in £-space (shown as black circles) deviate signifi- 
cantly more from zero, showing that the Gaussian approximation 
of the univariate distributions is far more accurate in y- than in 
£-space - this is true also for other lag parameters than the ones 
used in the examples of the previous sections. 

Note that although we used N - 32 (real-space) grid points 
for our simulations, we only show the moments up to n = 15, 
since the g„ with higher n contain no additional information due 
to the periodic boundary conditions (and thus, the corresponding 



y n have no physical meaning). Additionally, by construction, the 
yo-component does not exist. 

Since the fact that the univariate PDFs piyi) are "quite 
Gaussian" does not imply that the full multivariate distribution 
p(y) is well described by a multivariate Gaussian distribution, we 
also perform a multivariate test. While numero us tests for multi- 
variate Gaussianity exist (see e.g. Henze 2002 for a review), we 
confine our analysis to moment-based tests. To do so, it is neces- 
sary to generalize skewness and kurtosis to higher dimensions - 
we use the well-established definitions by Mardia ( 1970, 1974): 
Considering a sample of <i-dimensional vectors x,, a measure of 
skewness for a of-variate distribution can be written as 



7d 



±- 2 ±±[ {Xi -,?c-{ X] -4 



(26) 



'=1 7=1 



where n denotes the sample size, and p and C are the mean 
and covariance matrix of the sample. The corresponding kurtosis 
measure is 



Kd 



1 n 

- J] {( Xi - p) T C- 1 ( Xi - p)f - d(d + 2), 

/=1 



(27) 



where we subtracted the last term to make sure that the kurtosis 
of a Gaussian sample is zero. 



Fig. 8 shows the results of the multivariate test. For the same 



simulation run as before (using only 5000 realizations to speed 
up calculations), the skewness and kurtosis of the n-variate dis- 
tributions, i.e. p(^o, ■ ■ •i&s-i) (black circles) and p(yi,...,y n ) 
(red, upward triangles) are plotted as a function of n. For com- 
parison, the blue squares show the moments of the n-variate 
distributions marginalized from a 15-variate Gaussian. It is ap- 
parent that also in the multivariate case, the assumption of 
Gaussianity is by far better justified for the transformed vari- 
ables y than for f, although the approximation is not perfect. 

To avert comparing PDFs of different random variables 
(namely £ and y), we perform an additional check: We draw a 1 5- 
variate Gaussian sample in ^-space and transform it to y-space 
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n 



n 



Fig. 8. Mardia's skewness (left-hand panel) and kurtosis (right-hand panel) of n-variate {£}- (black circles) and {y}-samples (red, 
upward triangles) for a Gaussian power spectrum with Lko - 80. The green, downward triangles show the skewness / kurtosis of 
{y}-samples obtained under the assumption of a Gaussian {£ }-sample, and the blue squares show the skewness / kurtosis of Gaussian 
samples. See text for more details. 



(using only the realizations which lie inside the constraints); 
the corresponding values of skewness and kurtosis are shown as 
green, downward triangles. Clearly, they are by far higher than 
those obtained for the "actual" y-samples, further justifying our 
approach. 



Here, we take p to be the "true" PDF as obtained from the simu- 
lations, and q the approximation. In order to calculate them, we 
introduce binning, meaning that we again split the range of £1 in 
bins of width A^ and discretize Eq. (28) The values we obtain 



3.2. Tests in %-space 

In the following, we will directly compare the "true" distribu- 
tion of the correlation functions as obtained from simulations 
to the Gaussian and the quasi-Gaussian PDF. As an example, 
we consider the univariate PDF p(^i). [Fig. 9| shows the different 
distributions - it is clear by eye that the quasi-Gaussian PDF is 
a far better approximation for the true (black) distribution than 
the best-fitting Gaussian (shown in blue and dot-dashed). In ad- 
dition, it becomes clear that we can improve our results by in- 
corporating the ^o-dependence of the covariance matrix: While 
the assumption of a constant covariance matrix already yields a 
quasi-Gaussian (shown as the green dotted curve) which is close 
to the true distribution, the red dashed curve representing the 
quasi-Gaussian which incorporates the ^-dependence of the co- 
variance matrix (in the same way as was done for the mean) is 
practically indistinguishable from the true PDF. In the following, 
we would like to quantify this result. 

While a large variety of rigorous methods for comparing 
probability distributions exist, we will at first present an intu- 
itive test: A straightforward way of quantifying how different 
two probability distributions p(g) and q{^) are is to integrate over 
the absolute value of their difference, defining the "integrated 
difference" 




/ 



d£ \p(g) - q(g)\ . 



(28) 



Fig. 9. p{^i) for a field with N — 32 grid points and a Gaussian 
power spectrum with LIcq = 80. The black curve is from simula- 
tions, the blue dot-dashed curve shows the best-fitting Gaussian, 
and the other two curves show the quasi-Gaussian approxima- 
tion using a constant covariance matrix (green dotted curve) and 
incorporating its ^-dependence (red dashed curve). 
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show a slight dependence on the number of bins - the reason is 
Poisson noise, since Ai nt is not exactly zero even if the approx- 
imation is perfect, i.e. if the sample with the empirical PDF p 
was indeed drawn from the distribution q. The dependence on 
the binning becomes more pronounced in the multivariate case, 
but in the univariate example we consider here, A; nt provides a 
coherent way of comparing the distributions. Namely, if we di- 
vide the range of ft into 100 bins (the same number we used 
for Fig. 9 1, we obtain Aj nt w 0.18 for the Gaussian PDF and 
Ai nt ~ 0.025 (0.011) for the quasi-Gaussian PDF with constant 
(ft-dependent) covariance matrix, thus yielding a difference of 
more than one order of magnitude in the latter case. 

A well-established measure for the "distance" between two 
probability distributions p and q is the Kullback-Leibler (K- 
L) divergence, which is defined as the difference of the cross- 
entropy r H x of p and q and the entropy 'H of p: 



D KL (p,q) = 'H X (p,q)-<H(p) 



-I 



d^(ftlog^§. 



(29) 



As before, the values obtained vary slightly with the number of 
bins. Using 100 ft -bins yields Dkl ~ 0.04 for the Gaussian and 
Dkl ~ 0.001 for quasi-Gaussian PDF with a constant covariance 
matrix. Here, incorporating the ft-dependence of the covariance 
matrix has an even stronger impact, resulting in Dkl ~ 0.0002. 
Hence, the K-L divergence gives us an even stronger argument 
for the increased accuracy of the quasi-Gaussian approximation 
compared to the Gaussian case. 

4. Impact of the quasi-Gaussian likelihood on 
parameter estimation 

In this section, we will check which impact the new, quasi- 
Gaussian approximation of the likelihood has on the results of 
a Bayesian parameter estimation analysis. As data, we generate 
400 000 realizations of the correlation function of a Gaussian 
random field with N = 64- grid points and a Gaussian power 
spectrum with Lko - 100. From this data, we wish to obtain in- 
ference about the parameters of the power spectrum (i.e. its am- 
plitude A and its width ko) according to |Eq. (1)] so — (A,ko), 
To facilitate this choice of parameters, we parametrize the power 
spectrum as P(k) - A ■ 100/ (Lko) ex p{ - (£Ao) 2 ), where we 
choose A - I Mpc and ho = 1 Mpc~' as fiducial values (cor- 
responding to L ko = 100 and a field length L - 100 Mpc). Since 
we use a flat prior for 9 and the denominator (the Bayesian evi- 
dence) acts solely as a normalization in the context of parameter 
estimation, the shape of the posterior p(0\g) is determined en- 
tirely by the likelihood. 

For the likelihood, we first use the Gaussian approximation 



£(6) = p(f\0) = 



1 



(2n)"' 2 V^tQ 
x exp j-i (£ (6) - ft d ) T • Cf ■ (f (0) - fr d ) j , 



(30) 



where £ = (ft, • ■ ■ ,ft-i) an d Q denotes the covariance matrix 
computed from the {ft-sample; note that we use only n — 32 
^-components, since the last 32 components yield no additional 
information due to periodicity. In an analysis of actual data, the 
measured value of the correlation function would have to be in- 
serted as mean of the Gaussian distribution - since in our toy- 
model analysis, we use our simulated sample of correlation func- 
tions as data, we insert £gd = £ (0m) instead, which is practically 
identical to the sample mean of the generated realizations. 



For a second analysis, we transform the simulated realiza- 
tions of the correlation function to >'-space and adopt the quasi- 
Gaussian approximation, meaning that we calculate the likeli- 
hood as described in lSect. 21 



p(W = 



1 



(2to ( "- 1)/2 vaitc; 
i 



x exp l--(y(0)-<y (ft))) 1 • C,: 1 • (y (6) - (y (ft))) 



p(ft)-|det(7^)|. 



(31) 



Here, instead of inserting the fiducial value y^i as "measured 
value", we incorporate the ft-dependence of the mean (y) in the 
same way as previously described, meaning by calculating the 
average only over those realizations with a ft-value close to the 
"current" value of ft, which is the one determined by the fixed 
value of 6, i.e. ft(0). In contrast to that, C, denotes the covari- 
ance matrix of the full y-sample, meaning that we neglect its 
ft-dependence, although we have shown in lSect. 3T21 that incor- 
porating this dependence increases the accuracy of the quasi- 
Gaussian approximation. The reason for this is that for some 
values of ft, the number of realizations with a ft-value close 
to it is so small that the sample covariance matrix becomes sin- 
gular. However, since the toy-model analysis presented in this 
section is about a proof of concept rather than about maximizing 
the accuracy, this is a minor caveat - of course, when apply- 
ing our method in an analysis of real data, the ft-dependence 
of C y should be taken into account. It should also be mentioned 
that apart from the ft-dependence, we also neglect any possible 
dependence of the covariance matrix on the model parameters 0, 
since this is not expected to have a strong influence on parameter 
estimation (for example in the case of BAO studies, Labati e et al.l 
2012b show that even if C does have a slight dependence on the 
model parameters, incorporating it in a Bayesian analysis only 
has a marginal effect on cosmological parameter constraints). 

The ^-dependence of the last two terms also merits some 
discussion, in particular, the role of the fiducial model param- 
eters 0^ has to be specified: While the Gaussian likelihood 
discussed previously is of course centered around the fiducial 
values by construction, since £(6m) is inserted as mean of the 
Gaussian distribution, this cannot be done in the case of the 
quasi-Gaussian likelihood due to its more complicated mathe- 
matical form. 

The p(ft)-term in |Eq. (3 PJI can be treated in a straightforward 
way: Namely, we fix the shape of the PDF p(ft) by determining 
it from the fiducial power spectrum parameters 6m and then eval- 
uate it at the current value ft(0). Thus, we compute this term as 
/?e lid (ft(0)) - in other words, we use 6^ to fix C„ as well as C m 
and then evaluate Eq. (23)| at ft(#). 

The last term, i.e. the determinant of the transformation ma- 
trix, however, has to be evaluated for the fiducial value, yielding 
| det(7^^ v (0gd))|. Thus, this term has no ^-dependence at all and 
plays no role in parameter estimation; similarly to the Bayesian 
evidence, it does not have to be computed, but can rather be un- 
derstood as part of the normalization of the posterior. This can be 
explained in a more pragmatic way: Assume that a specific value 
of £ has been measured and one wants to use it for inference 
on the underlying power spectrum parameters, incorporating the 
quasi-Gaussian likelihood. Then one would transform the mea- 
surement to y-space and rather use the resulting ^-vector as data 
in the Bayesian analysis than ft and thus, the | det /|-term would 
not even show up when writing down the likelihood. Here, we 
nonetheless included it in order to follow the train of thought 
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from lSect. 2l and for better comparison with the Gaussian likeli- 
hood in f-space. 

Note that the previous argument cannot be applied to the 
/?(£o)-term: Calculating it as p(^o(6m)\Om) (i.e. as independent 
of 6, making it redundant for parameter estimation), would yield 
biased results for two reasons: First, the transformation from £ 
to y involves only ratios gi/go, which means that one would ne- 
glect large parts of the information contained in £o - in the case 
discussed here, this would lead to a large degeneracy in the am- 
plitude A of the power spectrum. Furthermore, incorporating the 
non-negligible ^-dependence of the mean (y) immediately re- 
quires the introduction of the conditional probability in |Eq. (22)] 
thus automatically introducing the p(fo)-term. 

where the 



The resulting posteriors can be seen in Fig. 10 



left-hand panel shows the result for the case of a Gaussian likeli- 
hood, and the right-hand one is the result of the quasi-Gaussian 
analysis. Already for this simple toy model, the impact of the 
more accurate likelihood on the posterior is visible. The differ- 
ence may be larger for a different choice of power spectrum, 
where the deviation of the likelihood from a Gaussian is more 
pronounced. Nonetheless, it is evident that the change in the 
shape of the contours is noticeable enough to have an impact 
on cosmological parameter estimation. 



Fig. 1 1 shows the marginalized posteriors for A and ko, again 



for the Gaussian (black solid curve) and the quasi-Gaussian 
(red dot-dashed curve) case. As for the full posterior, there 
is a notable difference. While it may seem alarming that the 
marginalized posteriors in the quasi-Gaussian case are not cen- 
tered around the fiducial value, this is in fact not worrisome: 
First, as a general remark about Bayesian statistics, it is impor- 
tant to keep in mind that there are different ways of obtaining 
parameter estimates, and in the case of a skewed posterior (as 
the quasi-Gaussian one), the "maximum a posteriori" (MAP) 
is not necessarily the most reasonable one. Furthermore, in our 
case, it should again be stressed that the Gaussian likelihood is 
of course constructed to be centered around the fiducial value, 
since 0m is explicitly put in as mean of the distribution, whereas 
the quasi-Gaussian likelihood is mainly constructed to obey the 
boundaries of the correlation functions. These are mathemati- 
cally fundamental constraints, and thus, although none of the 
two methods are guaranteed to be bias-free, the quasi-Gaussian 
one should be favored. 



5. A copula approach 

In this section, we briefly investigate a potential alternative way 
of finding a new approximation for the likelihood of correlation 
functions. Since the exact univariate PDF off is known (Keitel 
& Schneider 2 0111) . using a copula approach seems to be an ob- 
vious step. According to the definition of the copula, the joint 
PDF of n random variables f ,■ can be calculated from the univari- 
ate distributions /?,(£,) as 



p(€i ,&,..., £») = c(u\, u 2 , . . . , u n ) ■ JJ ptiti), 



(32) 



;=1 



where the copula density function c depends on «,- = PSgi), i.e. 
on the cumulative distribution functions (CDFs) off,. In the sim- 
plest case, the copula function is assumed to be a Gaussian. 

Copulae have previously been used in cosmology, e.g. for 
the PDF of the density field of large-scale structure (Scherrer 
et al. 2010), o r the w eak lensing convergenc e power spectrum 
(see ISato et al.ll201ll and its companion paper ISato et al.ll2010h . 



In the case of a Ga ussian copula, th e bivariate joint PDF can be 
calculated (see e.g. lSato et al.ll201 lh as 



Ptfi.ft) = 



V _l__exp{-I(,-^C'(,- / ,) 

I ex JjliZ*t\)-\ H f ih (33 ) 



U l V2^r,- 



2crj 



where qi = O" 1 ^. (P (£;)). Here, p and C denote the mean and 
covariance matrix of the copula function, <b^ a is the Gaussian 
CDF. Note that contrary to our usual notation, here the indices 



of f are purely a numbering, and Eq. (33) can in fact be applied 
for arbitrary lags. 

To calculate the copula likelihood, we implement the analyt- 
ical univariate formulae for p,(£,) and PX£i) derived bv Keitel 
& Schneider (2011); the mean and covariance matrix of the 
Gaussian copula are calculated directly from the simulated {£}- 



sample. Fig. 12 shows the bivariate PDFs from the simulation 



(black contours) as well as the copula likelihood (red dashed 
contours) for two different combinations of lags. It is apparent 
that the copula likelihood does not describe the true PDF very 
well. In particular, it does not even seem to be a more accu- 
rate description than the simple multivariate Gaussian used in 
the left-hand panel of Fig. 2 leading us to the conclusion that 



our quasi-Gaussian approximation should be favored also over 
the copula likelihood. Of course, the accuracy of the latter might 
improve if a more realistic coupling than the Gaussian one was 
found. 



6. Box-Cox transformations 



As mentioned in lSect. ll the idea of transforming a random vari- 
able in order to Gaussianize it suggests testing the performance 
of Box-Cox transformations. They are a form o f power trans- 
forms originally introduced by Box & Cox ( 1964) and have been 
used in astronomical applications, see e.g. Joachimi et al. (201 1) 
for results on Gaussianizing the one-point distributions of the 
weak gravitational lensing convergence. 

For a sample of correlation functions {£,} at a certain lag, the 
form of the Box-Cox transformation is 



iiU,a) 



[(gi + a) A - l] I A A±G 
ln(£ + a) A = Q 



(34) 



with free transformation parameters A and a, where the case A - 
corresponds to a log-normal transformation. For the following 
statements, we determine the optimal Box-Cox parameters for 
each lag i using the proc edure of maximi z ing th e log-likelihood 
for A and a explained in I Joachimi et al] (1201 lh and references 
therein. 

Trying to Gaussianize the full n-variate distribution 
p(go,£i, ■ ■ ■ ,£n) using this method turns out to be unsuccessful, 
meaning that the multivariate moments (as defined in lSect. 3TTT > 
of the transformed quantities £ are hardly any different from 
those of the original correlation functions £ . Additionally, there 
is barely any improvement in the Gaussianity of the univari- 
ate distributions p(£i) compared to p(g). In contrast to that, the 
transformation from £, to y resulted in an improvement in skew- 
ness and kurtosis by about an order of magnitude, as described 
in lSect. 31 

An alternative approach would be to treat all univariate dis- 
tributions p(gj) independently, i.e. to try and Gaussianize them 
separately. This of course leads to lower (univariate) skewness 
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Fig. 10. Posterior probability for the power spectrum parameters (A, k^) using the Gaussian (left panel) and the quasi-Gaussian (right 
panel) likelihood. The horizontal and vertical lines are the fiducial values (1.0, 1.0). 
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Fig. 11. Marginalized posterior probabilities for the power spectrum parameters A and k^. The solid black curves are the results 
obtained from the Gaussian likelihood, whereas the red dot-dashed curves show the results from the quasi-Gaussian analysis. 



and kurtosis in the transformed quantities £,-; however, the mul- 
tivariate moments are again almost unchanged compared to the 
untransformed correlation functions. 

Thus, in summary, the quasi-Gaussian approach seems far 
more accurate than using Box-Cox transformations. This is not 
too surprising, since the latter obviously cannot properly take 
into account correlations between the different random variables 
(i.e. the £,), whereas in contrast to that, the transformation £ — > y 
is specifically tailored for correlation functions and thus mathe- 
matically better motivated than a general Gaussianizing method 
such as power transforms. 



7. Conclusions and outlook 

Based on the exact univariate likelihood derived in Keitel & 
Schneider ( 20 1 1 ) and the constraints on correlation functions de- 
rived in Schneider & Hartlap (2009), we have shown how to cal- 
culate a quasi-Gaussian likelihood for correlation functions on 
one-dimensional Gaussian random fields, which by construction 
obeys the aforementioned constraints. Simulations show that the 
quasi-Gaussian PDF is significantly closer to the "true" distri- 
bution than the usual Gaussian approximation. Moreover, it is 
also superior to a straightforward copula approach, which cou- 
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Fig. 12. p(^i,^2) (left) and p{^,^) (right) for a field with N - 32 grid points and a Gaussian power spectrum with Lko - 80. The 
dashed red contours show the approximation obtained from a Gaussian copula. 



pies the exact univariate PD F of correlation functions derived by 
iKeitel & S chneider (2011) with a Gaussian copula. When used 
in a toy-model Bayesian analysis, the quasi-Gaussian likelihood 
results in a noticeable change of the posterior compared to the 
Gaussian case, indicating its possible impact on cosmological 
parameter estimation. 

As an outlook on future work, we would like to highlight 
some possible next steps: Applying the quasi-Gaussian approach 
to real data is obviously the ultimate goal of this project, and it 
would be of greatest interest to see the influence of the new like- 
lihood on the measurement of cosmological parameters. So far, 
this would only be possible for ID random fields, e.g. measure- 
ments from the Lye forest. However, since most random fields 
relevant for cosmology are two- or three-dimensional, general- 
izing the quasi-Gaussian approach to higher-dimensional fields 
is crucial to broaden the field of applicability. As Schneider & 
Hartlap (2009) showed, the constraints on correlation functions 
obtained for the ID fields are no longer optimal in higher dimen- 
sions, so the higher-dimensional constraints need to be derived 
first - work in this direction is currently in progress. 
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Appendix A: Equivalence of the simulation 
methods 

In this appendix, we want to prove analytically that the two sim- 
ulation methods mentioned in lSect. 2"31 are in fact equivalent. 

The established method calculates the correlation function 
components g m , m = 0, . . . , TV — 1 directly from the field compo- 
nents g n , n - 0, . . . , N — 1 . Since we impose periodic boundary 
conditions on the field, this can be done using the estimator 



1 N ~ l 



(A.l) 



The real-space field components are calculated from the Fourier 
components by 



N/2 



Zl Z7T1 
exp — i]\ fi, 

j=-N/2 V 



27ri 



(A.2) 



Here, we set g n = 0, which is equivalent to the field having zero 
mean in real space. Discretization and periodicity already imply 
g n/2 — g-N/2 - still, in order not to give double weight to this 
mode, we set it to zero as well. Of course, we then have to do 
the same in our new method, i.e. Pn/i = 0. Note that for the sake 
of readability, we still include this term in the formulae of this 
work. 
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Our new method draws a realization of the power spectrum Appendix B: Analytical Calculation Of the 

and Fourier transforms it to obtain the correlation function: i -dependence of mean and covariance matrix 



2 N/2 



2nmn 

N 



As mentioned in ISect. 21 our analytical calculation of the %q- 
( a -j) dependence of the mean and the covariance matrix does not 

produce practically usable results - nonetheless, it is interest- 
In both methods, the variance cr 2 (k n ) ^ = <||„| 2 ) of the field in § from a theoretical P oint ° f view and is thus presented in this 
components in Fourier space is determined by the power spec- PP 
trum namely for one realization We are ultimately interested in the mean j> and the covariance 

matrix C y , however, we will first show calculations in ^-space 



\U 2 = \Pk- 



(A.4) 



before addressing the problem of how to transform the results to 
y-space. 



To prove the equivalence of the two methods, we insert the 



Fourier transforms as given in |Eq. (A.2)] into the estimator, B.1. Calculation in £ -space 
|Fq. (A.l)| 



& = «Z Z Z ft«p(-^*»)**«p(-^*c» + ») 

n=0 l=-N/2 k=-N/2 \ * / \ 

j AM N/2 N/2 , 2jd > 

= n Z Z Z ~ gi ~ gk exp ~w (ln +kn+ km) ■ (A - 5) 

«=() /=-W/2 k=-N/2 ' ' 



The ^o -dependence of the mean (fi) (where the index is purely 
a numbering and does not denote the lag) can be computed as 



<6>(fo) 



/ 



dft ft jKftlft) 



The sum over « simply gives a Kronecker 5, since 

AM 

/ ';TI i 



with the conditional probability 



Kftlfo) 



^Z ex p& (/ -^ 

Thus, we obtain 

AT/2 AT/2 

fo. = z Z ^ exp i 

J»- A72 fc=-Ar/2 
AT/2 



P(fo) 



(B.l) 



(B.2) 



(A. 6) \\r e define the corresponding characteristic function as Fourier 
transform of the probability distribution (<£ <-> p in short-hand 
notation; for details on characteristic functions see Keitel & 
Schneider l201 ll hereafter KS201 1. and references therein): 



/27ri 



km\6- 



2^ gkg-k exp\ -rrhn 

k=-N/2 



N 



<D(*i;ft) = Jd£e ilrfl ,p(filfo) 
Kfilfo) = J | 



-e-^*<D(Ji;fo). 

27T 



(B.3) 
(B.4) 



N/2 



= Z l^*l 2 exp I -Tx-*w > 



k=-N/2 



2m 



Making use of the characteristic function ^(so, sj) (where 
(A 7) ^(sotSi) «-» /;>(fo,fi)) computed in KS2011, we can also write 



where in the last step, we used the fact that the g, are real, imply- 
ing gi - g\. In order to show that this is equivalent to |Eq. (A3)] 
we now split the sum into two parts, omitting the zero term (since 
go = 0, as we explained before): 



Ptfilfo) 



1 fdso f 

fdfLe-u.fi J_ f 
J 2;r p(fo)J 



L e -i*oft e -i*ifi T^o,*!) 



— »~" oft Y(j , Ji). (B.5) 



;V/2 



fo> = ^] l£*| 2 exp -rrfew + ^ Lg*| 2 exp —km 

k=-N/2 \ I •-' ' ' 



N/2 



2m, 



k=\ 

N/2 



1 2m 



k=\ 



= Z 1 ^ 2 exp ( _ — ^)+Z^ |2ex p(— 

N/2 . 

= 2-^|| t | 2 cos — km 

k=\ \ 

Inserting |Eq. (A.4)] we end up with 

fo, = --^VfcCOsf— few 

k=\ * 



km 



2jt p(£ ) J 2n 

Comparison with |Eq. (B.4)| yields 

<D(*i;&) ~f? e--^ ¥( J0> Sl ). (B.6) 

P(fo) J 2tt 

Now, we can calculate the mean (i.e. the first moment) from 



(A.8) 



(A.9) 



the characteristic function (equivalent to Eq. (B.l) - again, see 
KS201 1 and references there): 



(flXfo) = rj-*(5i;fo) n 



J_ fdfo e -^oJ_ ¥(i0) 
p(£o) J 2tt id*i 



*i) 



ii=0 



(B.7) 
(B.8) 



This is exactly the way we calculate £,„ in our new method - thus, 
we proved analytically that the two methods are indeed equiva- 
lent. As mentioned before, we also confirmed this fact numeri- 
cally. 



Using the result from KS2011 for the bivariate characteristic 
function, 



%si)-flf 



1 



2is C„ -2isiC n \ 



(B.9) 



13 



P. Wilking and P. Schneider: A quasi-Gaussian approximation for the probability distribution of correlation functions 
where C nm = o~l cos(k n x m ), we can calculate the derivative as 

1 



/■ 



+ (ft) (ft) (ft) (ft) dftdftMft.ftlft) 



d 

id^i 



^•*>L = Z 



2C„i 



Ur 



fri (1 - 2is C„ ) k+n 
, 1 - 2is C„ () \ \ 1 - 2i. 



ife=i 



2iioQo 
■soCjto 



'/ 



(ft) (ft) dftdftftKft.ftlft) 



*(*) X — 



2C„i 



(B.10) 



'/ 



■<&>(&) 



«=i 



(ft) (ft) dftdftftKft.ftlft). 



=<fi>(ft) 



The integral A can again be expressed in terms of the character- 
where we inserted the univariate characteristic function com- istic; function <t>(si i2'ft) *-* p(ft ftlft)' 
putedinKS2011, 



<*{*>) = n — 



i 



n=1 



2isoC n Q 



(B.ll) 



To calculate the integral in |Eq. (Bl8)] we use a Taylor expansion 
of Y„(s Q ) from |Eq. (B.10)| 



y„0o) * Yj 2k+i (Lvo) * c »o Cnl - 

We insert the derivative into |Eq. (B.8)| and thus obtain 

(ft)(ft)--^r- f^e-^TCjo) 

p (ft) J 2tt 

CO CO 

«=1 jfc=0 

According to the definition of ^(^o) <-> p(ft), 
d^(ft) 



(B.12) 



d d | 

A = 7 -———^(si,s 2 ;M 

ias\ ias2 \si=s 2 =o 

Similar to the previous calculations, 

<P(ji, 5 2 ;ft) = -L- f ^ e-'^» ¥(j , sun 

p (ft) J 2tt 

with the trivariate characteristic function 

1 



) 



T(jo,Ji,«o=n— 



2i5oC„o - 2isiC„i - 2is 2 C„ 



(B.16) 



(B.17) 



(B.18) 



Calculating the second derivative of |(B.18)1 yields 
d d 



< 



J 2tt ' 



(-iso)* e- i,oft ¥(*>), 



(B.13) id^t id«2 



(B.14) 



*F (S ,*l,*2) 



and thus, after changing the order of summation and integration, 
|Eq. (B.13)|can finally be written as 



T(*o) 



S 



7T^ 



^^nl^nl 



(1 - 2i^ C„ ) 2 

Z„(s ) 



4C„iCia 



order modes 



<fi>«» = E Z 2 "^'- 1 ^'*' ' 



(1 - 2i^oC„ ) (l -2i^oQ.o) 

Zt,„(so) 



(B.19) 



k=0 n=\ 



< Mft) 



(B. 15) The Taylor expansions of Z„(sq) and Z^{sq) read 



Inserting the known result for p(ft) and calculating its deriva- Z„(s ) ~ V 2* +2 (& + 1) (is ) k C* C„iC„ 2 , 
tives allows us to compare the analytical result to simulations. 



fc=0 



The results can be seen in Fig. B.l here, the black points with M / 

error bars show the mean of £,„ for different lags n as determined 2* (in) w V* 2 ,+2 (isn)' CwC i Y^ C p c' _/ ' 

from simulations (100 000 realizations, Gaussian power spec- ^ f^j 

trum with Lko = 100), and the colored symbols show the an- 

alytical results to different order (see figure caption). It seems Using it as well as the expansion |(B.12)[ we finally obtain 

that, although the Taylor series in |Eq. (B.15) does not converge, 

a truncation at order 10 yields sufficient accuracy, barring some e\fc\ V ^ d/»(ft) 

numerical issues for very low ^o-values. c °v\$iy$2)\$Q) - £j pCp Q \ jrf 

The ^o-dependence of the covariance matrix Q can be cal- ,1_ " 

culated in a similar way. We start from the general definition of 



(B.20) 
(B.21) 



modes 



£(-!)* 2* +2 (fc+l) 



covariance, 

cov(ft,ft)(ft) = <(ft - (ft)) (ft - <6»> 6 

■d fi d 6 (B-«» ( 6-<6»«..6tt) 



C n0 C„iC„2 + V (-1) 2 C m 2 C„i 






X Zj m0 ^«0 



p=0 



(ft)(ft)(ft)(ft)- (B.22) 



dftdftftftp(ft,ft|ft) 



We show a comparison of the results (for different elements 
of the covariance matrix) from simulations and the analytical 
formula in Fig. B.2 Again, the black dots are obtained from 
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Fig. B.l. The mean of tj„ for different n as function of fo, determined from simulations (black points with error bars) and analytically 
to zeroth (red crosses), first (blue circles), second (green filled triangles; left panel only), third (purple empty triangles; left panel 
only), and tenth (brown squares) order. 
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Fig. B.2. Different elements of the covariance matrix C({£„}), determined from simulations (black points with error bars) and ana- 
lytically to zeroth (red crosses), first (blue circles), fifth (purple triangles), and tenth (brown squares) order. 



simulations and the colored symbols represent the results from 
|Eq. (B.22)l where the last term (i.e. the one containing the mean 
values (£„)) was calculated up to tenth order, thus providing suf- 
ficient accuracy, as previously shown. As before, there are some 
numerical problems for very small values of £o- Additionally, 
the analytical results do not agree with the simulations for 
small lags, as can be seen from the left-most panel (the same 
holds for other covariance matrix elements involving small lags). 
However, for the the higher-lag examples (i.e. the right two pan- 
els), a truncation of the Taylor series at tenth order seems to be 
accurate enough. 



B.2. Transformation of mean and covariance matrix to 
y-space 

In the previous section, we showed how to calculate the (£o- 
dependent) mean and covariance matrix in £-space. The compu- 
tation of the quasi-Gaussian approximation, however, requires 
the mean y and the covariance matrix C y in y-space, which can- 
not be obtained from those in ^-space in a trivial way due to the 
highly non-linear nature of the transformation ^ — > y. 

Thus, instead of settling for a linear approximation, we have 
to choose a more computationally expensive approach. Namely, 
we calculate the first and second moments (in £) of the quasi- 
Gaussian distribution as functions of the mean and (inverse) co- 
variance matrix in y-space and equate the result to the analytical 



results, i.e. we solve a set of equation of the form 



(B.23) 
(B.24) 



where we did not write down the ^-dependence explicitly for 
the sake of readability. 

Note that this is a complicated procedure, since the integra- 
tion on the equations' left-hand sides can only be performed 
numerically (we make use of a Monte-Carlo code from Press 
et al. 2007). In order to solve the equation set (consisting of 
N + jN(N +1) equation for an Af-variate distribution) we use 
a multi-dimens ional root-find ing algorithm (as provided within 
the GSL. lGalassi et al1|20 09). However, due to the high dimen- 
sionality of the problem, this procedure does not seem practical, 
since it is computationally very expensive - in addition to that, 
any possible gain in accuracy is averted by the required heavy 
use of purely numerical methods. 

Thus, as described in lSect. 21 we refrain from using our an- 
alytical results for the mean and covariance matrix and simply 
determine them (as well as their £o -dependence) from simula- 
tions, which we have shown to be sufficiently accurate. 
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